A Tutorial on Thompson Sampling

Authors

  • Daniel Russo
  • Benjamin Van Roy
  • Abbas Kazerouni
  • Ian Osband
Abstract

Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use. This tutorial covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, dynamic pricing, recommendation, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. We will also discuss when and why Thompson sampling is or is not effective and relations to alternative algorithms.
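As a minimal sketch of the Bernoulli bandit case mentioned in the abstract, the loop below keeps a Beta posterior for each arm, draws one sample per arm, plays the arm with the largest sample, and updates that arm's posterior with the observed reward. The function name, the uniform Beta(1, 1) prior, the simulated environment, and the example success probabilities are illustrative assumptions, not details taken from the tutorial.

```python
import random


def thompson_sampling_bernoulli(true_means, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated bandit (illustrative).

    true_means: hypothetical success probabilities, hidden from the agent
                and used only to simulate rewards.
    Returns (number of pulls per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    # Assumed prior: Beta(1, 1), i.e. uniform, on each arm's success rate.
    alpha = [1.0] * n_arms
    beta = [1.0] * n_arms
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its current posterior.
        samples = [rng.betavariate(alpha[k], beta[k]) for k in range(n_arms)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=lambda k: samples[k])
        # Simulated environment: Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate update: successes go to alpha, failures to beta.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling_bernoulli([0.3, 0.5, 0.7], n_rounds=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

Sampling from the posterior, rather than using its mean, is what produces exploration in this sketch: an arm with few observations has a wide posterior, so it occasionally yields the largest sample and gets tried again.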

Similar resources

Horvitz-Thompson estimator of population mean under inverse sampling designs

Inverse sampling design is generally considered an appropriate technique when the population is divided into two subpopulations, one of which contains only a few units. In this paper, we derive the Horvitz-Thompson estimator for the population mean under inverse sampling designs, where subpopulation sizes are known. We then introduce an alternative unbiased estimator, corresponding to post-st...

Development and Usability Evaluation of an Online Tutorial for “How to Write a Proposal” for Medical Sciences Students

Background and Objective: Because learning how to write a proposal is important for students, this study was performed to develop an online tutorial on “How to Write a Proposal” for students and to evaluate its usability. Methods: This is a developmental research and tool-design study. The “Gamified Online Tutorial based on Self-Determination Theory (GOT-STD) Framework” became the basis f...

MCMC in the Analysis of Genetic Data on Pedigrees

This chapter provides a tutorial introduction to the use of MCMC in the analysis of data observed for multiple genetic loci on members of extended pedigrees in which there are many missing data. We introduce the specification of pedigrees and inheritance, and the structure of genetic models defining the dependence structure of data. We review exact computational algorithms which can provide a p...

A tutorial on Quasi-experimental designs

A main step in answering a scientific hypothesis in an epidemiological study is deciding which type of study is suitable to undertake, considering methodology, practical considerations, and budget and time limitations.

An Interactive Tutorial for Teaching Statistical Power

This paper describes an interactive Web-based tutorial that supplements instruction on statistical power. This freely available tutorial provides several interactive exercises that guide students as they draw multiple samples from various populations and compare results for populations with differing parameters (for example, small standard deviation versus large standard deviation). The tutoria...


Journal:
  • CoRR

Volume: abs/1707.02038

Publication date: 2017